The remark-parse package is a plugin for the Remark processor that parses Markdown content into a syntax tree. It is part of the unified ecosystem, which provides a way to parse, transform, and stringify content using abstract syntax trees (ASTs).

What are remark-parse's main functionalities?

Parsing Markdown

This feature allows you to parse Markdown content and transform it into an abstract syntax tree (AST). The code sample demonstrates how to use remark-parse with the remark library to parse a simple Markdown string.

const remark = require('remark');
const parse = require('remark-parse');

// parse() returns the syntax tree; process() would instead yield the serialized file.
const tree = remark().use(parse).parse('# Hello world!');
console.log(tree);

Extensible Markdown Parsing

remark-parse can be extended with plugins to handle custom Markdown syntax. In this example, the 'remark-math' plugin is used to parse mathematical expressions within the Markdown content.

const remark = require('remark');
const parse = require('remark-parse');
const math = require('remark-math');

remark().use(parse).use(math).process("Euler's identity: $e^{i\\pi} + 1 = 0$", function(err, file) {
  if (err) throw err;
  console.log(file);
});

remark-parse

Parser for unified. Parses Markdown to mdast syntax trees. Used in the remark processor but can be used on its own as well. Can be extended to change how markdown is parsed.

Install

npm:

npm install remark-parse

Use

var unified = require('unified')
var createStream = require('unified-stream')
var markdown = require('remark-parse')
var remark2rehype = require('remark-rehype')
var html = require('rehype-stringify')

var processor = unified()
  .use(markdown, {commonmark: true})
  .use(remark2rehype)
  .use(html)

process.stdin.pipe(createStream(processor)).pipe(process.stdout)

See unified for more examples »

API

See unified for API docs »

`processor().use(parse[, options])`

Configure the processor to read Markdown as input and process mdast syntax trees.

`options`

Options can be passed directly, or passed later through processor.data().

`options.gfm`

GFM mode (boolean, default: true).

hello ~~hi~~ world

Turns on:

Fenced code blocks
Autolinking of URLs
Deletions (strikethrough)
Task lists
Tables

`options.commonmark`

CommonMark mode (boolean, default: false).

This is a paragraph
    and this is also part of the preceding paragraph.

Allows:

Empty lines to split blockquotes
Parentheses (`(` and `)`) around link and image titles
Any escaped ASCII punctuation character
Closing parenthesis (`)`) as an ordered list marker
URL definitions (and footnotes, when enabled) in blockquotes

Disallows:

Indented code blocks directly following a paragraph
ATX headings (# Hash headings) without spacing after opening hashes or before closing hashes
Setext headings (Underline headings\n---) when following a paragraph
Newlines in link and image titles
White space in link and image URLs in auto-links (links in brackets, < and >)
Lazy blockquote continuation, lines not preceded by a greater than character (>), for lists, code, and thematic breaks

`options.footnotes`

Footnotes mode (boolean, default: false).

Something something[^or something?].

And something else[^1].

[^1]: This reference footnote contains a paragraph...

    * ...and a list

Enables reference footnotes and inline footnotes. Both are wrapped in square brackets and preceded by a caret (^), and can be referenced from inside other footnotes.

`options.pedantic`

Pedantic mode (boolean, default: false).

Check out some_file_name.txt

Turns on:

Emphasis (_alpha_) and importance (__bravo__) with underscores in words
Unordered lists with different markers (*, -, +)
If commonmark is also turned on, ordered lists with different markers (., ))
Removes fewer spaces from list items (at most four, instead of the whole indent)

`options.blocks`

Blocks (Array.<string>, default: list of block HTML elements).

<block>foo
</block>

Defines which HTML elements are seen as block level.

`parse.Parser`

Access to the parser, if you need it.

Extending the Parser

Typically, using transformers to manipulate a syntax tree produces the desired output. Sometimes, such as when introducing new syntactic entities with a certain precedence, interfacing with the parser is necessary.

If the remark-parse plugin is used, it adds a Parser constructor function to the processor. Other plugins can add tokenizers to its prototype to change how Markdown is parsed.

The plugin below adds a tokenizer for at-mentions.

module.exports = mentions

function mentions() {
  var Parser = this.Parser
  var tokenizers = Parser.prototype.inlineTokenizers
  var methods = Parser.prototype.inlineMethods

  // Add an inline tokenizer (defined in the following example).
  tokenizers.mention = tokenizeMention

  // Run it just before `text`.
  methods.splice(methods.indexOf('text'), 0, 'mention')
}

`Parser#blockTokenizers`

Map of names to tokenizers (Object.<Function>). These tokenizers (such as fencedCode, table, and paragraph) eat from the start of a value to a line ending.

See #blockMethods below for a list of methods that are included by default.

`Parser#blockMethods`

List of blockTokenizers names (Array.<string>). Specifies the order in which tokenizers run.

Precedence of default block methods is as follows:

newline
indentedCode
fencedCode
blockquote
atxHeading
thematicBreak
list
setextHeading
html
footnote
definition
table
paragraph
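
To illustrate why this order matters, the sketch below runs a toy dispatcher over a list of method names, in the same spirit as blockMethods. It is not remark-parse's implementation: the `tokenizers` map, the `firstMatch` helper, and both toy tokenizers are hypothetical, and real tokenizers receive `eat` and consume input rather than returning a boolean.

```javascript
// Toy model only: real remark-parse tokenizers take (eat, value, silent).
var tokenizers = {
  // Accepts a line made of three or more dashes.
  thematicBreak: function (value) {
    return /^-{3,}\s*$/.test(value)
  },
  // Accepts any non-empty value, like a catch-all paragraph tokenizer.
  paragraph: function (value) {
    return value.length > 0
  }
}

// Return the name of the first method whose tokenizer accepts `value`.
function firstMatch(methods, value) {
  var index = -1

  while (++index < methods.length) {
    if (tokenizers[methods[index]](value)) {
      return methods[index]
    }
  }

  return null
}

console.log(firstMatch(['thematicBreak', 'paragraph'], '---')) // 'thematicBreak'
console.log(firstMatch(['paragraph', 'thematicBreak'], '---')) // 'paragraph'
```

With the catch-all listed first, the more specific tokenizer never gets a chance to run, which is why paragraph comes last in the default list.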

`Parser#inlineTokenizers`

Map of names to tokenizers (Object.<Function>). These tokenizers (such as url, reference, and emphasis) eat from the start of a value. To increase performance, they depend on locators.

See #inlineMethods below for a list of methods that are included by default.

`Parser#inlineMethods`

List of inlineTokenizers names (Array.<string>). Specifies the order in which tokenizers run.

Precedence of default inline methods is as follows:

escape
autoLink
url
html
link
reference
strong
emphasis
deletion
code
break
text
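
As a small illustration of how a plugin changes this order, the sketch below applies the same splice call used by the mentions plugin to a plain copy of the list above. The `methods` array here is a local copy for demonstration, not the parser's own array.

```javascript
// Local copy of the default inline method names listed above.
var methods = [
  'escape', 'autoLink', 'url', 'html', 'link', 'reference',
  'strong', 'emphasis', 'deletion', 'code', 'break', 'text'
]

// Insert 'mention' at the position currently held by 'text',
// so that it runs just before the catch-all text tokenizer.
methods.splice(methods.indexOf('text'), 0, 'mention')

console.log(methods.slice(-3)) // [ 'break', 'mention', 'text' ]
```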

`function tokenizer(eat, value, silent)`

There are two types of tokenizers: block level and inline level. Both are functions, and work the same, but inline tokenizers must have a locator.

The following example shows an inline tokenizer that is added by the mentions plugin above.

tokenizeMention.notInLink = true
tokenizeMention.locator = locateMention

function tokenizeMention(eat, value, silent) {
  var match = /^@(\w+)/.exec(value)

  if (match) {
    if (silent) {
      return true
    }

    return eat(match[0])({
      type: 'link',
      url: 'https://social-network/' + match[1],
      children: [{type: 'text', value: match[0]}]
    })
  }
}

Tokenizers test whether a document starts with a certain syntactic entity. In silent mode, they return whether that test passes. In normal mode, they consume that token, a process which is called “eating”.

Locators enable inline tokenizers to function faster by providing where the next entity may occur.
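
The difference between silent and normal mode can be tried without a parser. The sketch below repeats the tokenizer from above and calls it with `fakeEat`, a hypothetical stand-in for the parser's `eat` that only records what was eaten and returns the node unchanged; the real `eat` also tracks positions and adds the node to the tree.

```javascript
// Same tokenizer as in the example above.
function tokenizeMention(eat, value, silent) {
  var match = /^@(\w+)/.exec(value)

  if (match) {
    if (silent) {
      return true
    }

    return eat(match[0])({
      type: 'link',
      url: 'https://social-network/' + match[1],
      children: [{type: 'text', value: match[0]}]
    })
  }
}

// Hypothetical stand-in for `eat`: records the eaten text and returns
// an `add` function that hands the node back unchanged.
var eaten = []

function fakeEat(subvalue) {
  eaten.push(subvalue)
  return function add(node) {
    return node
  }
}

var detected = tokenizeMention(fakeEat, '@wooorm says hi', true)
var eatenAfterSilent = eaten.length
var node = tokenizeMention(fakeEat, '@wooorm says hi')
var missed = tokenizeMention(fakeEat, 'no mention here', true)

console.log(detected, eatenAfterSilent) // true 0 (silent mode eats nothing)
console.log(node.url) // 'https://social-network/wooorm'
console.log(eaten) // [ '@wooorm' ]
console.log(missed) // undefined
```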

Signatures

Node? = tokenizer(eat, value)
boolean? = tokenizer(eat, value, silent)

Parameters

eat (Function) — Eat, when applicable, an entity
value (string) — Value which may start an entity
silent (boolean, optional) — Whether to detect or consume

Properties

locator (Function) — Required for inline tokenizers
onlyAtStart (boolean) — Whether nodes can only be found at the beginning of the document
notInBlock (boolean) — Whether nodes cannot be in blockquotes, lists, or footnote definitions
notInList (boolean) — Whether nodes cannot be in lists
notInLink (boolean) — Whether nodes cannot be in links

Returns

boolean?, in silent mode: whether a node can be found at the start of value
Node?, in normal mode: the node, if one can be found at the start of value

`tokenizer.locator(value, fromIndex)`

Locators are required for inline tokenizers. Their role is to keep parsing performant.

The following example shows a locator that is added by the mentions tokenizer above.

function locateMention(value, fromIndex) {
  return value.indexOf('@', fromIndex)
}

Locators enable inline tokenizers to function faster by providing information on where the next entity may occur. Locators may be wrong: it’s OK if there actually isn’t a node to be found at the index they return.

Parameters

value (string) — Value which may contain an entity
fromIndex (number) — Position to start searching at

Returns

number — Index at which an entity may start, and -1 otherwise.
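
A few calls to the locator above show the contract, including a case where the candidate index turns out not to start a mention. The sample strings are made up for illustration, and the regular expression is the one from the mentions tokenizer.

```javascript
function locateMention(value, fromIndex) {
  return value.indexOf('@', fromIndex)
}

var sample = 'ping @alice and @bob'

console.log(locateMention(sample, 0)) // 5
console.log(locateMention(sample, 6)) // 16
console.log(locateMention('no mentions', 0)) // -1

// A locator may point at something that is not an entity after all:
// the index is only a candidate, and the tokenizer has the final say.
var candidate = locateMention('a @ b', 0)
var isMention = /^@(\w+)/.test('a @ b'.slice(candidate))

console.log(candidate, isMention) // 2 false
```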

`eat(subvalue)`

var add = eat('foo')

Eat subvalue, which is a string at the start of the tokenized value.

Parameters

subvalue (string) - Value to eat

Returns

add.

`add(node[, parent])`

var add = eat('foo')

add({type: 'text', value: 'foo'})

Add positional information to node and add node to parent.

Parameters

node (Node) - Node to patch position on and to add
parent (Parent, optional) - Place to add node to in the syntax tree. Defaults to the currently processed node

Returns

Node — The given node.
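
The relationship between `eat` and `add` can be pictured with a toy model. The sketch below is an assumption-laden simplification, not remark-parse's code: `createEat` is a hypothetical helper, positions carry only offsets (no lines or columns), and a mismatched `eat` simply throws.

```javascript
// Toy model: tracks offsets only and always starts at offset 0.
function createEat(value, parent) {
  var offset = 0

  return function eat(subvalue) {
    // The eaten text must sit at the start of what is left to tokenize.
    if (value.slice(offset, offset + subvalue.length) !== subvalue) {
      throw new Error('Cannot eat `' + subvalue + '`: not at the start of the value')
    }

    var start = offset
    var end = offset + subvalue.length
    offset = end

    // `add` patches positional information onto the node and
    // appends it to the given parent, or to the default one.
    return function add(node, otherParent) {
      var target = otherParent || parent
      node.position = {start: {offset: start}, end: {offset: end}}
      target.children.push(node)
      return node
    }
  }
}

var root = {type: 'root', children: []}
var eat = createEat('foo bar', root)

var first = eat('foo')({type: 'text', value: 'foo'})
var second = eat(' bar')({type: 'text', value: ' bar'})

console.log(root.children.length) // 2
console.log(second.position.start.offset, second.position.end.offset) // 3 7
```

Splitting the work in two steps means the position is fixed when the text is consumed, while the node itself can be built afterwards.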

`add.test()`

Get the positional information that would be patched on node by add.

Returns

Position.

`add.reset(node[, parent])`

add, but resets the internal position. Useful for example in lists, where the same content is first eaten for a list, and later for list items.

Parameters

node (Node) - Node to patch position on and insert
parent (Node, optional) - Place to add node to in the syntax tree. Defaults to the currently processed node

Returns

Node — The given node.

Turning off a tokenizer

In some situations, you may want to turn off a tokenizer to avoid parsing that syntactic feature.

Preferably, use the remark-disable-tokenizers plugin to turn off tokenizers.

Alternatively, this can be done by replacing the tokenizer from blockTokenizers (or blockMethods) or inlineTokenizers (or inlineMethods).

The following example turns off indented code blocks:

var remarkParse = require('remark-parse')

remarkParse.Parser.prototype.blockTokenizers.indentedCode = indentedCode

function indentedCode() {
  return true
}

Security

As Markdown is sometimes used for HTML, and improper use of HTML can open you up to a cross-site scripting (XSS) attack, use of remark can also be unsafe. When going to HTML, use remark in combination with the rehype ecosystem, and use rehype-sanitize to make the tree safe.

Use of remark plugins could also open you up to other attacks. Carefully assess each plugin and the risks involved in using them.

Contribute

See contributing.md in remarkjs/.github for ways to get started. See support.md for ways to get help. Ideas for new plugins and tools can be posted in remarkjs/ideas.

A curated list of awesome remark resources can be found in awesome remark.

This project has a Code of Conduct. By interacting with this repository, organisation, or community you agree to abide by its terms.

Package last updated on 09 Nov 2019
